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Abstract 



< 

We present a comparative study of two methods for the reduction of the dimension- 
ality of a system of ordinary differential equations that exhibits time-scale separa- 
tion. Both methods lead to a reduced system of stochastic differential equations. The 
novel feature of these methods is that they allow the use, in the reduced system, of 
higher order terms in the resolved variables. The first method, proposed by Majda, 
Timofeyev and Vanden-Eijnden, is based on an asymptotic strategy developed by 
| Kurtz. The second method is a short-memory approximation of the Mori-Zwanzig 

projection formalism of irreversible statistical mechanics, as proposed by Chorin, 
Hald and Kupferman. We present conditions under which the reduced models aris- 
CTN \ ing from the two methods should have similar predictive ability. We apply the two 

methods to test cases that satisfy these conditions. The form of the reduced models 
and the numerical simulations show that the two methods have similar predictive 
. ability as expected. 

Key words: Mori-Zwanzig formalism, Stochastic mode reduction, scale separation, 
stochastic equations. 



1 Introduction 



In recent years have appeared different methods for reducing the dimensional- 
ity of a system of ordinary or stochastic differential equations. The realization 
that despite the rapid increase of computational power there are many prob- 
lems that are too expensive to tackle directly, or that in several problems the 
objects of interest are macroscopic, coarse-grained quantities, has led many 
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researchers to construct methods for extracting models of reduced dimension- 
ality. Since the motivation to develop such methods comes usually from specific 
physical problems, most methods exploit the mathematical structure of the 
problem at hand, and thus are not suited for all problems where dimensional 
reduction is needed. 

Since there exists a whole range of dimension reduction methods, it is inter- 
esting to compare the behaviour of the reduced models predicted by different 
methods in examples that are inspired by problems of physical interest. The 
purpose of the present work is to report some results of the application of 
the asymptotic mode reduction strategy (AMRS) and of the short-memory 
Mori-Zwanzig approximation (short-memory MZ) to a collection of models 
that share common features with more complicated systems that appear in 
the study of climate dynamics. These non-trivial test cases appeared in [1] 
where an analysis of their behaviour and of the performance of AMRS can be 
found. The analysis here suggests that the asymptotic mode reduction strategy 
and short-memory MZ approximation have, under conditions that we discuss 
in Section 4, similar predictive ability. The test cases satisfy these conditions 
and thus, we expect the reduced models arising from the two methods to have 
similar predictive ability. The form of the reduced models and the numerical 
simulations support our expectations. 

The majority of the methods for dimensional reduction rely on a separation of 
time-scales between the variables of interest and the rest of the variables in the 
system under investigation. This separation of scales can be used in different 
ways depending on the form of the solutions for the fast dynamics. Thus, in 
the case of inertial manifold methods [2], the fast variables are, after a short 
transient, slaved to the slow ones. In the case of methods that are known as 
"averaging methods" [3], the fast variables are allowed to have more complex 
behaviour and affect the slow variables through the empirical measure on the 
fast dynamics. The resulting equations for the slow variables are deterministic. 
In the case of the asymptotic mode reduction strategy [4], the fast variables 
reach a statistical equilibrium much faster than the slow variables and this 
is used to obtain (in a certain limit) the reduction of the system's dimen- 
sion. The resulting system of equations for the slow variables is stochastic. In 
the case of [5] the resolved variables are coupled to a heat bath that is then 
approximated using the trigonometric representation for Gaussian processes. 
Chorin, Hald and Kupferman [6,7], have proposed the use of the Mori-Zwanzig 
projection formalism of irreversible statistical mechanics, for the reduction of 
the dimensionality of a system of ordinary differential equations. In contrast 
to the previous methods, the Mori-Zwanzig projection formalism results in a 
reformulation of the equations for the resolved variables. This does not make 
the problem of dimensional reduction easier. However, the reformulation serves 
as a starting point for approximations of various degrees of sophistication de- 
pending on the properties of the problem at hand. Finally, we should note 
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the work of Papanicolaou [8,9,10,11] which uses a projection formalism (sim- 
ilar to Mori-Zwanzig, albeit with a different projection operator) on the level 
of the Fokker-Planck and backward Fokker-Planck (Chapman-Kolmogorov) 
equations. 

Every mode reduction method has two parts: a) identify the variables that will 
constitute the reduced description and b) find the equations for these variables. 
The techniques mentioned above avoid the first part by assuming that the 
variables to be picked are known in advance. However, knowledge of what 
are the "right" variables to pick in an arbitrary system is a very important 
and difficult problem, since different combinations of variables can lead to 
very different reduced equations. This is an important issue for the efficient 
numerical implementation of any mode reduction method. Techinques based 
on proper orthogonal decomposition [12], coarse-grained integration [13] and 
the transfer operator [14] attempt not only to perform dimension reduction, 
but also to identify the appropriate variables that will constitute the reduced 
system. A concise exposition of all the methods mentioned above along with 
applications to models can be found in [15]. 

The paper is organized as follows. In Section 2 we present a summary of the 
asymptotic mode reduction strategy as it is applied to a system of ordinary 
differential equations. In Section 3 we present the Mori-Zwanzig projection 
formalism and the short-memory approximation arising from it which is ap- 
propriate for the examples that we will be studying. This approximation serves 
to illustrate some analogies and differences between the two methods and we 
present them in Section 4. We also present conditions under which the two 
methods are expected to have similar behaviour. In Section 5 we present the 
equations for the test cases that we will be examining. Section 6 collects the re- 
sults of the application of the two methods to the test cases. A short discussion 
follows in Section 7. 



2 Asymptotic mode reduction strategy 

Suppose we are given a system of ordinary differential equations 



with initial condition 0(0) = x. The asymptotic mode reduction strategy [4,16] 
is a two step procedure based on the assumption that the set of variables of 
the system have been split into two subsets, the resolved and unresolved. The 
objective is to construct equations for the evolution of the resolved variables by 
eliminating the unresolved variables. The splitting of the variables in resolved 
and unresolved is not arbitrary but relies on the assumption that there exist 
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two sets of variables in the system that evolve on different time-scales. The 
slow variables are identified as resolved ones while the fast variables as unre- 
solved. In the first step, the equations for the unresolved variables are modified 
by replacing the nonlinear self-interaction terms between the unresolved vari- 
ables by stochastic terms. The motivation for this is that the self-interaction 
terms govern the perturbations on short time-scales, and if we are interested 
in coarse-grained modelling on longer time-scales, these terms can be replaced 
accurately by stochastic terms. It is important to note that this replacement 
is an assumption that has to be checked case by case. The second step uses 
projection techniques for stochastic differential equations [17,18,19,8] to elim- 
inate the fast variables. The elimination procedure can be made rigorous in 
the limit of infinite separation of time-scales between resolved and unresolved 
variables (see e.g. [17]). 

We present a summary of the two steps of the method for a system of ordinary 
differential equations. We will focus on equations whose form can describe all 
the examples that we will study. More general forms can be found in [4,15] 
and references there. 

For the first step we assume that we have obtained in some way the splitting 
of the variables in resolved and unresolved such that these two sets evolve 
on different time-scales. Suppose that the vector of system variables can be 
written as = (0, 0), where are the resolved variables and the unresolved 
ones. Similarly, the initial conditions can be written as x = (x,x). The first 
step is to represent the nonlinear self-interactions between unresolved variables 
in the equations for the unresolved variables with suitable stochastic terms. 
In general, it is not easy to justify such a replacement (see e.g. [20]). For 
the examples that we study, the justification has been numerical [1]. We will 
return to this point when we present the test cases. We rewrite the system (1) 
as 

d t (2) 
^ = i?(0) = iJ(0) + 5(0), 

where the term 5(0) contains the nonlinear self-interaction terms between the 
unresolved variables and H(<f>) contains the rest of the terms on the RHS of 
the equations for the unresolved variables. The first step of AMRS replaces 
the term S by the following stochastic expression 

— * 

where T, a are positive definite (diagonal in our case) matrices and W(t) is 
a vector- valued Wiener process. As mentioned in [4], the choice of this par- 
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ticular expression in not essential to the theory but it is convenient for the 
calculations because of the availability of explicit expressions for the inverse 
of the associated backward Fokker-Planck operator. The parameter e roughly 
measures the ratio of the correlation times of the unresolved variables to the 
resolved ones and its precise definition will be given later when we study the 
test cases. The stochastic approximation introduces the matrices T, a that 
have to be determined before we proceed with the second step of AMRS. 
The entries of T and a are chosen so as to minimize the difference between 
the statistical moments of the original system and those of the approximate 
stochastic system (we refer to this as statistical agreement). For our choice of 
stochastic approximation which introduces two parameters, the procedure will 
involve the one-time and two-time statistical properties of (1). We can use the 
fluctuation-dissipation theorem (one-time statistics) to reduce the number of 
parameters to be determined to only the entries of the matrix T. The matrix 
T is determined through the two-time statistics of (1). More details on how to 
determine T, a will be given when we present the test cases. 

With the stochastic approximation (3) the system (2) is replaced by 

I = R ^ 

a i (4) 

To facilitate the presentation, we can think of the following splitting of the 
RHS for the resolved variables, 

R(4>) = &>(</>) + eR 1 (j>), 

where R°(4>) is the fast part of the RHS that includes the contributions from 
the unresolved variables, while ei? x (0) is the slow part that comes only from 
interactions between the resolved variables. The asymptotic mode reduction 
strategy can be made rigorous in the limit of e — > 0. To prepare the equations 
for the final elimination of the unresolved variables, we coarse-grain the time 
t — > et and we have 

at e /g\ 

at e e 2 e at 

As can be seen from (5), the dynamics for <fi are an order of magnitude faster 
than the ones for 0. 

Before we proceed with the second step of AMRS that eliminates the un- 
resolved variables, we note again that the approximation (3) is a working 
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assumption that has to be checked case by case. However, if we assume that 
the approximation (3) is valid, then, due to a theorem by Kurtz [17], the pro- 
cess of elimination of the unresolved variables becomes rigorous in the limit of 
inifmite scale separation. The theorem operates on the level of the Chapman- 
Kolmogorov (or backward Fokker-Planck) equation associated with (5). 

When presented with a system of stochastic differential equations of the form 
(5) we can construct two linear partial differential equations, one for the evo- 
lution of the probability density function (Fokker-Planck equation) and one 
for the evolution of expectation values of functions of the solution (Chapman- 
Kolmogorov equation). The Chapman-Kolmogorov equation is the adjoint of 
the Fokker-Planck equation with respect to the scalar product defined on 
X = M n , where n is the dimensionality of the system (1) (see below). We shall 
focus our attention on the Chapman-Kolmogorov equation, since we are only 
interested in giving a brief account of AMRS (see [15] for more details). For 
any scalar- valued function g(x), we define the quantity 

u e {x,t) =Eg($). 

Then -u e satisfies the Chapman-Kolmogorov partial differential equation 

du e 1 „ c 1 _ c „ c 

— = -^du e + -£ 2 u € + C 3 u% 6 
at e 2 e 

with u e (x, 0) = g(x). Note that while u e (x, 0) is only a function of x, u e (x,t) 
is a function of all the variables. The operators £±, C 2 and C 3 are defined by 

a=EAV)^-+E^(-)^- (7) 

where 1%, 1% are index sets for the resolved and unresolved variables respec- 
tively. Also, 7j, (jj are the (diagonal) entries of the matrices T, a. 

The second step of AMRS eliminates the unresolved variables. From (6) we 
see that the limit e — ► is singular. Also, that (6) is a partial differential 
equation involving both resolved and unresolved variables. Thus, our aim is 
to derive (if possible), in the limit e — > 0, a Chapman-Kolmogorov equation 
for the resolved variables only and from this equation read off the form of the 
corresponding stochastic system for the resolved variables. We will use only 
formal manipulations that can be made rigorous in the limit of e — > through 
Kurtz's theorem. We proceed by expanding the solution u e (x,t) in powers of 
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u e (x, t) — uq + tui + e 2 «2 + . . . . (8) 

We insert (8) into (6) and collect terms of equal power in e. We obtain the 
following hierarchy of equations 

Ciuo = (9) 
C\Ui = -C 2 u (10) 

£iu 2 = _ ^2«i - £3^0 (11) 



Define a scalar product on X by 

(a, 6) = f a(x)b(x)dx, (12) 

for any two scalar- valued functions a, b. The solvability of the equations (9)- 
(11) requires that the RHS of the equations belong to the range of Ci, or 
equivalently that they are orthogonal (under the scalar product just defined) 
to the kernel of C\, the adjoint of C\. By construction, we have chosen the 
dynamics of C\ in such a way that the kernel of C\ contains only one element. 
This is the invariant density f s (x) under the dynamics governed by the linear 
stochastic operator in (3). Thus, the solvability condition for all the equations 
implies that the RHSs average to zero with respect to f s (x). Consequently, if 
we denote by P the projection on the subspace X of resolved variables, where 



P = 



= /. -f s (x)dx, 



we have I* (RHS) = 0. Also, the solvability of the equation (9) implies Puq = 
uo, i.e. Mo is independent of x (see [15]). From equation (7) and the definition 
of the projection P we have that P£i = C±P = (for more details see [4,15]). 
We collect these remarks and find that the function u satisfies the equation 

^ = p£ 3 p«o - PMr^Puo (13) 

uq(x,0) = g(x) 

where C^ 1 is the inverse of the operator C\ (see Appendix A in [4]). In the limit 
of e — > 0, Kurtz's theorem makes these manipulations rigorous by stating that 
under the condition P£ 2 P = 0, i.e the solvability condition for (10), the solu- 
tion u e (x,t) tends to u (x, t), where u satisfies (13). Moreover, equation (13) 
is a Chapman-Kolmogorov equation involving only the resolved variables and 
since C^ 1 is computable for the stochastic approximation (3), we can use (13) 
to compute the corresponding stochastic differential equations for the resolved 
variables. This concludes the second step of AMRS. Note, that in general, the 
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effective compuational use of the Chapman-Kolmogorov equation for the re- 
solved variables relies on the operator L\ having a closed form expression for 
its inverse. 



3 Mori-Zwanzig projection formalism and the short-memory ap- 
proximation 



We present the Mori-Zwanzig formalism and derive from it the approximation 
that we will be using later, namely the short-memory approximation. 



3. 1 Conditional expectations and the Mori-Zwanzig formalism 



Suppose that we are given the system of equations (1) with initial condition 
(f)(0) = x. Furthermore, assume that we know only a fraction of the initial 
data, say x, where x = (x, x) and correspondingly <p = (0, 0) and that the 
unresolved data are drawn from a measure with density f(x). Unlike the case 
of AMRS, here we do not assume anything about the form of the density f(x). 

Suppose u, v are functions of x, and introduce the scalar product (u, v) = 
E[uv] = J u(x)v(x)f(x)dx. We will denote the space of functions u with 
E[u 2 } < oo by L 2 (f) or simply L 2 . We are looking for approximations of 
functions of x by functions of x, where x are the variables that form our re- 
duced system (the resolved degrees of freedom). The functions of x form a 
closed linear subspace of L 2 , which we denote by L 2 . Given a function u in 
L 2 , its conditional expectation with respect to x is 

E[UIX] ~ JJdx- 

The conditional expectation £7[tt|x] is the best approximation of u by a func- 
tion of x: 

E[\u - E[u\x]\ 2 ] < E[\u-h{x)\ 2 } 

for all functions h. 

We pick a basis in L 2 , for example hi(x), h 2 (x), . . .. For simplicity assume that 
the basis functions hi(x) are orthonormal, i.e., E[hihj] = 5ij. The conditional 
expectation can be written as = J2 a jhj(x), where aj = E[uhj] = 

E[u(x, x)hj(x)]. If we have a finite number of terms only, we are projecting 
on a smaller subspace and the projection is called a finite-rank projection. In 
the special case where we pick a the finite set of functions h\(x) = xi, h 2 (x) = 
x 2 , ■ ■ ■ , h m (x) = x m , then the corresponding finite-rank projection is called in 
physics the "linear" projection (note that all projections are linear, so "linear" 
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is used to denote that the projection is on linear functions of the resolved 
variables). We should note here that it is not always true that E[xiXj] = for 

The system of ordinary differential equations we are asked to solve can be 
transformed into the linear partial differential equation [21] 

u t = Lu, u(x, 0) = g(x) (14) 

where L = Y,i Ri( x )~it~ an d the solution of (14) is given by u(x, t) = g(4>(x, t)). 
Consider the following initial condition for the PDE 

g(x) = Xj =>- u(x,t) = <f>j(x,t) 

Using semigroup notation we can rewrite (14) as 

_ & Ob 4 Ob A 

dt 3 3 
Let Q = I — P. Equation (14) can be rewritten as [21] 

^e tL Xj = e tL PL Xj + e tQL QL Xj + jf* e {t ~ s)L PLe sQL QLx j( is, (15) 

where we have used Dyson's formula 

i-t 



e tL = e tQL 



+ / e {t ~ s)L PLe sQL ds. (16) 
Jo 



Equation (15) is the Mori-Zwanzig identity [22,23,24]. Note that this relation 
is exact and is an alternative way of writing the original PDE. It is the starting 
point of our approximations. Of course, we have one such equation for each of 
the resolved variables <f>j,j = 1, . . . , m. The first term in (15) is usually called 
Markovian since it depends only on the values of the variables at the current 
instant, the second is called "noise" and the third "memory". The meaning of 
the different terms appearing in (15) and a connection (and generalization) to 
the fluctuation-dissipation theorems of irreversible statistical mechanics can 
be found in [6,25]. 



If we write 

e tQL QLxj = wj, 

Wj(x,t) satisfies the equation 

■§ i w j (x,t) = QLwj(x,t) 
Wj(x, 0) = QLxj = Rj(x) — PRj(x). 

If we project (17) using any of the projections discussed we get 

d 

P—Wj{x,t) = PQLwj(x,t) = 0, 



(17) 
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since PQ = 0. Also for the initial condition 

Pwj(x,0) = PQLxj = 

by the same argument. Thus, the solution of (17) is at all times orthogonal to 
the range of P. We call (17) the orthogonal dynamics equation. Since the solu- 
tions of the orthogonal dynamics equation remain orthogonal to the subspace 
L 2 , we can project the Mori-Zwanzig equation (15) on L 2 and find 

^-Pe tL Xj = Pe tL PLx 3 + P f e {t ~ s)L PLe sQL QL X] ds, (18) 



3.2 The short-time and short-memory approximations 



The approximation we will examine is a short-time approximation and consists 
of dropping the integral term in Dyson's formula (16) 



e tQL « e tL . (19) 

In other words we replace the flow in the orthogonal complement of L 2 with 
the flow induced by the full system operator L. Some algebra shows that 

Q(e sQL -e sL ) = 0(s 2 ), (20) 

and 

J* e {t - s)L PLe sQL QLxjds = J* e ( '~ s)i PLQe sL QL Xj ds + 0(t 3 ). (21) 



As expected the approximation (19) is good only for short times. However, un- 
der certain conditions this approximation can become valid for longer times. To 
see that consider the case where P is the finite-rank projection so PLQe s ® L QLxj = 
T! k=1 (LQe sQL QLx j: h k )h k (x) and PLQe sL QLxj = Y} k=1 {LQe sL QL Xj , h k )h k {x). 
The quantities (LQe sL QLxj, h k ) can be calculated from the full system with- 
out recourse to the orthogonal dynamics. Recall (20) which states that the 
error in approximating e s ® L by e sL is small for small s. This means that for 
short times we can infer the behaviour of the quantity (LQe s ® L QLxj, h k ) by 
examining the behaviour of the quantity (LQe sL QLxj, h k ). 

If the quantities (LQe sL QLxj, h k ) decay fast we can infer that the quan- 
tities (LQe sQL QLxj, h k ) decay fast for short times. We cannot infer any- 
thing about the behaviour of (LQe s ® L QLxj, h k ) for larger times. However, 
if (LQe s ® L QLxj,h k ) not only decay fast initially, but, also, stay small for 
larger times, then we expect our approximation to be valid for larger times. 
To see this consider again the integral term in the Mori-Zwanzig equation. The 
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integrand becomes negligible for t ^> to, where to is the time of decay of the 
quantities (LQe sQL QLxj, hk)- This means that our approximation becomes 



Jo 



f e {t ~ s)L PLe sQL QLx 3 ds 



e^- s)L PLQe sL QLx 3 ds + O(tjj). 



From this we conclude that the short-time approximation is valid for large 
times if to is small and is called the short-memory approximation. On the 
other hand, if to is large, then the error, which is 0{t\), becomes large and 
the approximation is only valid for short times. Note that the validity of 
the short-memory approximation can only be checked after constructing it, 
since it is based on an assumption about the large time behaviour of the un- 
known quantities ( y LQe s< ^ L QLxj,hk). Note, that determination of the quan- 
tities (LQe s ® L QLxj, hk) requires the (usually very expensive) solution of the 
orthogonal dynamics equation. The short-memory approximation, when valid, 
allows us to avoid the solution of the orthogonal dynamics equation. 

If the quantities (LQe sL QLxj, hk) do not decay fast, then we can infer, again 
only for short times, that the quantities (LQe s ® L QLxj, hk) of the exact Mori- 
Zwanzig equation do not decay fast. Yet, it is possible that the quantities 
(LQe sQL QLxj, h k ) start decaying very fast after short times and remain small 
for longer times, so that the short-memory approximation could still hold. 
Of course, this can only be checked a posteriori, after the simulation of the 
short-memory approximation equations. 

In the statistical physics literature, the assumption that the correlations vanish 
for s 7^ is often made which is a special case of the short-memory approx- 
imation with the correlations replaced by a delta-function multiplied by the 
integrals. An application of the short-memory approximation can be found in 



4 A formal comparison of the two methods 

We use the conditions under which the asymptotic mode reduction strategy 
holds to derive conditions under which we can construct a short-memory re- 
duced model in the context of the MZ formalism. This allows us to obtain a 
form of the reduced system in the MZ formalism that is more amenable to 
comparison to the reduced model obtained by AMRS. We should note here 
that it is possible to start from the Chapman-Kolmogorov equation (6) and 
apply a projection formalism similar to Mori-Zwanzig using the operator P not 
P (see e.g. [9] and references therein). Under suitable assumptions (the same 
conditions as the ones we will postulate for the derivation of the short-memory 
MZ model in this section) one can arrive at the reduced Chapman-Kolmogorov 
equation (13). Our intention in this section is different. We want to compare 



[26]. 
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the reduced short-memory model that arises when MZ is applied to the de- 
terministic system (1) to the reduced short-memory model that arises from 
AMRS, where the system (1) is first replaced by the stochastic system (2). 

Write S = -Sq. This is along the lines of introducing explicitly the scaling 
of the stochastic operator in (3) (note that in the following we continue to 
refer to the dynamics of the term as S dynamics and not So dynamics). If, in 
addition, we coarse-grain time t — > et, the system (1) becomes 



£ = i*<,) + i*<*). 



(22) 



The operator L associated with (22) is 



L = \L\ + -L 2 + L 3 (23) 



e 2 ' e 



where 



d 

Li = ^2 S 0:j {x) — , L 2 = C 2 and L 3 = £ 3 . 



From (18) we find for the resolved variable Xj, j — 1, . . . , m 

Pe tL PL Xj + P J* e {t - s)L PLQe sQL QLxjds, (24) 
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where L is given by (23). We want to use the conditions under which AMRS 
holds to simplify the equation (24). Note that AMRS works with the projection 
operator P while the MZ formalism works with the projection operator P 
(we will say more about the relation between the two operators below). The 
asymptotic mode reduction strategy has two steps. In the first step, under 
suitable assumptions, the dynamics described by S are replaced by a linear 
stochastic operator. In the second step, under suitable assumptions which 
guarantee the applicability of Kurtz's theorem, the unresolved variables are 
eliminated and a reduced model for the resolved variables is obtained. Our 
goal here is to translate all these assumptions (conditions) in the language of 
the MZ formalism, and, thus, obtain the conditions under which, for a system 
with L given by (23), it is possible to construct a short-memory model for the 
slow variables. 



The conditions under which Kurtz's theorem holds are, as already mentioned 
in Section 2, 

P£ 2 P = 0, PCi = CiP = 0. (25) 
For the operator P used in the MZ formalism we have , by construction, 
LiP = 0. In the same way that AMRS can be applied if the conditions (25) 
hold, we suggest that similar conditions should hold for the operator P, if a 
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short-memory model is to be found for the special case where L is given by 
(23). Thus, we begin by postulating the conditions 



PL 2 P = 0, PL X = L X P = 0. (26) 

The condition PL 2 P = implies that Pe tL PL 2 Pxj = 0. We note here that the 
conditions (26) hold for the test cases. Under these conditions, the equation 
for the resolved variable Xj becomes 

^-Pe^x, = Pe tL PL 3 P Xj + ^-P f e {t - s)L PL 2 e sQL L 2 Px j( is, (27) 
dt e 2 Jo 

where we have also used the fact that Pxj = Xj. The first term is the analog 
of the first term in (13) and is the contribution of the interactions among 
the resolved variables to the reduced model. The second term in (27) comes 
from the interaction among the resolved and unresolved variables and is the 
analog of the second term in (13). The analogy becomes more clear when 
we examine the meaning of the integrand PL 2 e s ® L L 2 Pxj. The existence (or 
not) of a reduced model in the limit e — > depends on the behaviour of the 
integrand PL 2 e s ® L L 2 Pxj and, in particular, on the behaviour of L 2 e s ® L L 2 Pxj. 
To see this better we can consider the case when P is the finite-rank projection 
on a set of orthonormal functions hi(x), . . . , hi(x). The integral term in (27) 
becomes 
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2e sQ{ ^ Ll+ ^ +L3) L 2 x j , h k )hS(t - s))ds 



The kernels (L 2 e sQ ^ Ll+ ^ L2+L '^ L 2 Xj, hk) are correlations of the resolved-unresolved 
interaction terms under the orthogonal dynamics. Given the specific structure 
of the operator L and the conditions (26), the kernels (L 2 e s ®^ Ll+ ~ L2+L3 ^ L 2 Xj, hk) 



can be approximated, in the limit e — > 0, by [L 2 e'^ LY L 2 Xj, h k ). So, in the limit 
of e — > 0, the orthogonal dynamics are dominated by the S dynamics of the 
unresolved variables. This is also the meaning of the operator C 2 C^C 2 in (13). 
Moreover, if the correlations (L 2 e s ~? Ll L 2 Xj, h k ) decay on a time-scale 0(e 2 ), 
then the integral term contributes to the RHS of (27) at the same order with 
the first term. Equation (27) becomes 



— Pe tL r 

dt 3 



Pe tL PL 3 P Xj + ^P J* e {t - s)L PL 2 e s ^ Ll L 2 P Xj ds, (28) 



Equation (28) is, for the specific L given by (23), the form that the short- 
memory MZ approximation equation for the conditional expectation of the 
resolved variable Xj acquires. 

In summary, for L given by (23), we have that: i) the conditions given by (26), 
ii) the approximation (L 2 e sQi ~? Ll+ * L2+L3) L 2Xj , h k ) « (L 2 e^ Ll L 2 x h h k ) and 
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iii) the assumption that the quantities (L 2 e s ^ Ll L 2 Xj, hk) decay fast, guarantee 
the existence of a short-memory MZ model in the limit e — > 0. The reduced 
model (28) bears great formal similarity to the reduced model (13) obtained 
by AMRS (see also Appendix A in [4] for the integral representation of C~ r ). 
However, there are two main points of difference that we now discuss. 

The first point is the difference between the operators L\ and L\. This differ- 
ence arises from the stochastic approximation (3) in the first step of AMRS. 
As we have already mentioned, it is not easy to justify this approximation in 
the general case [20]. For the test cases the justification was numerical [1]. 

The second point is the difference between the projection operators P and P. 
The operator P is defined by 

P- EE /_ -f c dx, 
JX 

where f c (x,x) = j/^ d - is the conditional density conditioned on the resolved 
variables x. On the other hand, the operator P is defined by 

P = /_ -f,dx, 

JX 

where f s (x) is the invariant density of the dynamics governed by the linear 
stochastic operator appearing in (3). Thus, the question if and how P relates to 
P is reduced to the question of if and how f c is related to f s . It is not obvious, 
if and how f c and f s are related in general. However, for the case when the 
original system (1) admits an invariant density /, the question is ultimately 
connected to the first point mentioned above, i.e. whether the replacement of 
the S dynamics by a stochastic operator is justified. To see that we will need 
a result for the conditional density f c in the limit e — > 0, which we now derive. 
The result states that, if the density / is invariant for the system (1), then, in 
the limit e — > 0, the conditional density f c is invariant under the S dynamics. 

Indeed, if the density / is invariant for the original system (1), then the equa- 
tion for the evolution of / is 

L*f = [±L* 1 + h* + L* 3 ]f = 0, (29) 

where L* is the adjoint of L with respect to the scalar product (, ) defined in 
(12). Note that, by construction 

L*2 — £*2 and L*^ — j0 3 

but, in general, 
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In the limit e — > and, if we assume that the measure remains smooth enough 
to admit a density, the term -^L\f should dominate and (29) becomes 



l;/ = o 



(30) 



By the definition of f c we have f — f c fx fdx and (30) gives 



L\[h [-fdx] = [[fdx]L*J c = 0, 



(31) 



since L\ contains derivatives only with respect to the unresolved variables x 
while fx fdx is a function of the resolved variables x. From (31) we find 



Equation (32) is the result we need. It states that, in the limit e — > 0, the 
conditional density f c is an invariant density for the S dynamics. On the 
other hand, we have, by definition, 



If we assume that the equations (32-33) admit each a unique solution (for the 
same boundary conditions), then the relation between the densities f c and f s 
is governed by the relation between the operators L* and £*. But the relation 
between L\ and C\ is determined by the relation between the operators L\ and 
Ci, i.e. by whether or not we are allowed to replace the fast dynamics S by a 
stochastic operator. So, the relation between the two projection operators P 
and P, which are defined by f c and f s respectively, is determined, in the case 
that / is invariant for (1), by whether or not we are allowed to replace the fast 
dynamics S by a stochastic operator. For the test cases we have f s = f c and 
thus, we expect the reduced models constructed from the two methods to have 
similar predictive ability. The numerical simulations support our expectations. 



5 The models 

The models that we use to compare the two stochastic mode reduction meth- 
ods mentioned above first appeared in [1], where the performance of the 
asymptotic mode reduction strategy was evaluated. We chose these exam- 
ples since, there exist already published results about them for the asymptotic 
mode reduction strategy and all the parameters involved are well-documented. 
All the models that we shall examine have a common structure. They consist 
of a system of deterministic differential equations where a few (two at most) 
slowly evolving variables are coupled to a fast evolving heat bath. The heat 



L*J C = 0. 



(32) 



CJs = 0. 



(33) 
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bath comes from a Fourier- Galerkin truncation of the Burgers-Hopf system 



u t + 7}(U 2 ) X = 0, 



where x G [0, 2ir]. If we expand the solution u(x,t) in Fourier series 



k=i 

we find for the mode Uk 

duk ik t-^ 

— — = — — > uwUk-k', 1 < k < A, 
dt 2 " " 

where A controls the size of the truncation. Also, u^k = u* k and we set in all 
calculations uq = 0. 

As demonstrated numerically by Majda and Timofeyev in [27], this truncation 
(for a large enough number of modes) is a deterministic but chaotic and mixing 
system, that is ergodic on suitably defined equi-energy surfaces, and the time- 
correlations of the modes obey a simple scaling law. These properties qualify 
this system as a good candidate for a stochastic heat bath. The modes of the 
Galerkin truncation will be the unresolved variables that we wish to eliminate, 
thus obtaining a stochastic system for the slowly evolving resolved variables. 
The coupling of the variables on the RHS of the equations will be of triad 
type, i.e. no variable appears on the RHS of its own equation. 

The first model is called the additive case and is given by 

— = A 2^0* VkZk, -TT = -Re— 2^ Uk'Uk-k' + A6 fe ' x ± z k , 

Ik " ik zl ( 34 ) 

at 1 |fc'|<A 

where Uk = Uk + izk- The interaction coefficients are of order 1 and satisfy 

bf z + bf z + bf y = 0. (35) 

The parameter A is a measure of the strength of the coupling between the 
variable Xi and the Burgers bath. The characterization additive is given in 
anticipation of the form of the reduced model for X\ which will have a noise 
term of additive type. 
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The second model is called the multiplicative case and is given by 
~7T = A J2( b k 2Vx 2yk + b 1 J 2z x 2 z k ), ^ = A J2( b k lyx iVk + b^xxZk), 

k k 

dy k D ik ^ y \ 12 

-TT = - Re ^ 1^ u k'U k -h> + A6£' x ± x 2 , (36) 

\k'\<A 

dz k ik ^ xtz|12 

— = -Im— 2^ u k ,u k - k > + A6 fc ' 

dt 2 |fc'|<A 

where the interaction coefficients are of order 1 and satisfy 

b? V + bf y + bf 2 = 0, bf z + bf z + bf 2 = 0. (37) 

The characterization multiplicative is given in anticipation of the form of the 
reduced model for xi,x 2 which will have noise terms of multiplicative type. 

The third model is a combination of the additive and the multiplicative case 
and is called the combined case 

= ^aJ2 b k VZ yk Z k + ^m^Z{b l } 2V X 2 y k + b^ X 2 Z k ) , 
at k k 

-JjT = Kj2 b k VZ VkZk + ^mJ2( b k 1Vx iyk + b 2 ^ Z X x Z k ), 

d k 'k k 

— j = -Re— u k >u k - k > + X a b y k llz xiz k + X a b y k l2z x 2 z k + A m fe| |12 XiX 2 , 

at 1 |fc'|<A 

^ = -Irn^- u k ,u k _ k , + \ a b z k y x x y k + \ a b z J 2y x 2 y k + \ m b z k ll2 x 1 x 2 , 

dt 1 |fc'|<A 

where the interaction coefficients are of order 1 and satisfy 

h l\yz h y\lz h z\ly _ » 0yz ,y\2z ,z\2y _ „ 

°k + °k + °k — u ? °k + °k + — u > /oq\ 

,112)/ . ,21 lj/ . ,2/112 n ,1122 . ,2|l2 . u z\12 n V ' 

V + V + = 0, 6 fe ' + b k [ + b k [ = 0. 

The values of the interaction coefficients used in the numerical experiments 
can be found in [1]. Due to the constraints on the interaction coefficients and 
the incompressibility of the Burgers bath (when u — 0), the additive model 
(34) admits the invariant density 

f(x) = Z- 1 exp(-/3(x 2 + yl + z l + ... + yl + 4)), (40) 

where Z is the normalization constant and (5 is a parameter to be determined 
through the equivalence of the canonical and microcanonical ensembles for 
large enough A. For the multiplicative and combined cases the density 

f(x) = Z- 1 exp(-(3(xl + xl + yl + zl + ...+yl + z 2 A )), (41) 
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is invariant and Z and (3 are defined as in the additive case. To conform with 
the notation of the previous sections we set x = X\ in the additive case or 
x = {x\,X2) in the multiplicative and combined cases, while for all three cases 
x — (yi,Zi, . . . , y\, z\). The form of the density implies equipartition of energy 
among all the variables present in a model (we avoid differentiating among the 
models when there is no risk of confusion) and thus 

Var{x\) = Var{x2) = Var{y\) = . . . = Var{z^) = ——. 

2(3 

For all the numerical simulations (3 = 50 and A = 50. These values are consis- 
tent with an initial condition with energy E = Xi(0) 2 + Sfc=i2/i(0) 2 + ^i(O) 2 
since 

l_ E 

2(3 1 + 2A 

for the additive model. For the multiplicative and combined cases we have 

1 E n 



2(3 2 + 2A' 

where E = xi(0) 2 + X2(0) 2 + Z)fc=il/i(0) 2 + £i(0) 2 . In the numerical simula- 
tions the initial conditions were sampled from the invariant density (40) or 
(41) (depending on the model at hand) by the Box- Mueller method [28] and 
invoking the equivalence of the canonical and microcanonical ensembles. This 
equivalence was shown for the Burgers bath only in [27] for A = 50, but the 
form of the invariant density suggests it should hold also for the coupled sys- 
tems (see also discussion in [4]). This is supported by the fact that, in the 
numerical simulations, the quantities of interest were computed by averaging 
over initial conditions and are identical to the results in [1], where they were 
obtained by time-averaging over a single trajectory. 



6 Construction and application of the reduced models 



We proceed to construct the reduced models for the test cases mentioned 
above. 



6.1 Asymptotic mode reduction strategy models 



For AMRS the reduced models were constructed in [1], so we only have to 
construct here the reduced models for the short-memory MZ approximation. 
However, as promised in Section 2, we mention briefly the methods involved 
in the first step of AMRS (the stochastic approximation step) for the deter- 
mination of the parameters introduced by the stochastic approximation (3). 
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The parameters introduced are the entries of the matrices T, a. For our test 
cases, T = diag('~f 1 , . . . , 7 a) and a = diag[o\ } . . . , cr\). In the numerical simu- 
lations, the resolved variables will be coupled only to the first five modes of 
the Burgers bath. However, this does not change anything about the method 
used to determine the parameters of the stochastic approximation. Since we 
have two parameters, the statistical agreement between 

and 

I = %•*> 

at ~ - (42) 

will have to involve the one-time and two-time statistics of the system (see 
[1] for details). First, we use the one-time statistics to reduce the number of 
parameters. The unresolved variables evolve on a faster time-scale than the 
resolved ones and, thus, are considered to relax quickly to a stationary state. 
For the stochastic approximation (3), the unresolved variables yk,Zk reach a 
stationary state described by a Gaussian density with zero mean and variance 

2 

2^, k = 1, . . . , A. On the other hand, we know from (40) or (41) that every 
unresolved variable is Gaussianly distributed with zero mean and variance 

2 

Thus, the identification = ^ yields a relation between and 7 fc and 
reduces the parameters to be determined to only 7^. To do this, we exploit 
the two-time statistics of the system. For the stochastic approximation (3), 
the correlations exhibit exponential decay and the characteristic decay time 
is 7^" 1 . Of course, the coupling with the resolved variables will alter the decay 
rates so that the correlations do not decay exactly as exponentials. So, the 
first step is to approximate the actual correlation functions by exponentials 

exp(- 7 f 8 |t|) 
2(3 

where 7 f ns is the inverse area below the correlations E[y fe (£)y fe (0)], ~E[zk(t)zk(0)] 
normalized by 2/3: 

7fc ns = (2/3 x area under the actual correlation of mode k) -1 . 

The superscript denotes that the correlations are computed from simulations 
of the full system and E denotes expectation with respect to the density for the 
full system. The second step is to pick the values of 7^ in (42) so as to optimize 
consistency with exp(— r y£ ns \t\)/2p. In [1], there are three different procedures 
suggested to achieve this. The first procedure (PI) uses the scaling law for the 
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correlation times for the Burgers bath [27] to identify the coefficients 7^ 



dk 

where C\ is a constants to be determined by simulations of the Burgers bath. 
The second procedure (P2) sets 7^ = 7^ ns . The third procedure (P3) adjusts 
the 7fc in such a way that the correlations for (42) reproduce the functions 
exp(— ^ ns \t\)/2/3 as close as possible (more details about the procedures can 
be found in [1]). For the additive case the procedure P3 is superior, while for 
the multiplicative and combined cases the procedures P2 and P3 are superior 
and give similar results. The numerical values that we used for the 7^ can be 
found in [1]. 

After we determine the parameters of the stochastic approximation we can 
apply the elimination procedure outlined in Section 2 to obtain the reduced 
system for the resolved variables. The elimination procedure requires the in- 
troduction of a small parameter e that controls the magnitude of the different 
terms that appear in the equations. For the additive and multiplicative cases, 

A 



while for the combined case 

_ max(A a , A m ) 

The next step is to coarse-grain the time (t — > et) and apply the second step 
of the elimination procedure presented in Section 2. 

For the additive case the reduced equations read 

&X\ 

— - = -7x1 + aW{t) 
at 

A 2 (bf z f R (43) 

7 = — > — — — , cr=J-. 

Thus, AMRS predicts that the variable x\ is an Ornstein-Uhlenbeck process. 

For the multiplicative case, the reduced equations for the resolved variables 
read 

— = -A 2 7Xi - A 2 A^iX^i + \a 11 x 2 W 1 {t) + \a 12 x 2 W 2 (t) 



dx 2 ^ 
— = -\ 2 ^x 2 - \ 2 N 2 x\x 2 + Xa^XxWxit) + \a 22Xl W 2 (t) 
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Thus, AMRS predicts that the reduced equations for the resolved variables 
require the introduction of nonlinear terms and multiplicative noises. Here 
Wi(t),W 2 (t) are independent Wiener processes and the various parameters 
are defined as follows. Let 

a = r 1 £ r k \{b? y Y + {bf z n b = ^± %\(bTr + (bfr) 

k=l k=l 

k=i 

(45) 

Then 

l = -\c, N 1 = 0(A + C), N 2 = (3(B + C), (46) 
and the matrix o is defined as 

\ o^^ (Jul 

(47) 




and has the property 



- -T 
oo = 



t A C 



C B, 



(48) 



The matrix in (48) is positive definite and its square root o exists, although 
it is not unique. For the simulations, o was chosen to be symmetric. 



Finally, for the combined case, the reduced equations read 

- A^rci - A^Aqa^Zi + \ m d r nX2W l (t) + \ m o l2 x 2 W 2 (t) 

(49) 



dt 

- nfllXl ~ 712^2 + VllW 3 (t) + 12 W 4 (t), 
= ~ ^ml X 2 ~ \ 2 m N 2 x\x 2 + \ m 2 lXiWi{t) + \ m ^22XlW 2 (t) 



~ ll2X\ ~ 722^2 + <7i2 W 3 (t) + G 22 W 4 (t). 

The parameters Ni, N 2 ,o are given by (46-48) and the parameters for the 
contributions of the additive terms are given by 

\2 A \2 A (l?\v z \2 \2 A i}\yzi?\yz 

A a \°h > „ a I „ A a ST °fc °k /r n \ 

1ii = 5j£— • -* = v£— • 7l2 = i/?£^^' (50) 



The matrix o is defined as 




(51) 



21 



and has the property 



<j u 



T 



7n 7i2 



(52) 



v 7l2 722 J 

and like a it was chosen to be symmetric for the simulations. Finally, the 
processes W\{t),W2{t),W^{t),Wi{t) are independent Wiener processes. 



6.2 Short-memory MZ approximation models 



We construct the short-memory MZ equations for the three test cases. The 
invariant probability density used to compute all the necessary projections 
is given by (40) or (41) depending on the test case. The equations for the 
short-memory approximation are 

^e tL Xj = e tL PL Xj + e tL QL Xj + j* e ( '~ s)L PLQe sL QL Xj ds, (53) 

for j = 1 or j = 1,2 depending on the case. For the systems we examine 
the Markovian term PLxj is identically zero, so QLxj = Lxj. In this special 
case, e t ® L QLxj = e tL Lxj. The conditional expectation in the memory term 
is approximated by a finite-rank projection on an orthonormal basis consist- 
ing of modified Hermite polynomials. The choice of Hermite polynomials was 
motivated by the Gaussian form of the density. The basis is given by 

TO 

h K (x) = l[H Kj , (54) 
i=i 

where 

H K . = H K .((2P(1 + 2a K .))* Xj )(l + 2a Kj )-*e~ a ^ x l (55) 

For the additive case m — 1, while for the multiplicative and combined cases 
m = 2 and the multi- index k = (k\, . . . , K m ). The functions H K . are Hermite 
polynomials (with weight exp(— \xf)) satisfying 



1 k- — 1 
H Q (xj) = 1, H^Xj) = xj, H Kj ( Xj ) = ——XjHK.-ife)-*-* H K 2 {xj) 



Kj V K 3 

(56) 

The derivatives of the functions H K .(x) can be computed by the recursive 
relation 

-^H Kj ( Xj ) = V«J(2/3(1 + 2a Kj )) 1 *H Kj _ 1 (x j ) - 2a Kj (3x j H K .{x j ). (57) 

In general, we allow the different resolved variables to have different scaling 
factors a K ., but in all the numerical simulations the scaling factors were set 
to zero. 
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We also employ the proposition in [25] that, if the probability density is in- 
variant then the operator L is skew-symmetric. For the case of the finite-rank 
projection we find 



for j = 1 or j = 1,2 depending on the case and I denotes the m-tuples of 
indices used in the finite rank projection. Thus, the projection coefficients of 
the memory are correlations of different order of the noise term. The projection 
coefficients in the memory term in (58) are computed by sampling the invariant 
density, evolving the full system and averaging. The noise term is computed 
using the moving average method for sampling stationary stochastic processes 
with given mean and autocorrelation [29]. 

An important issue is how does one decide how many terms and of what form 
are needed in the expansion of the memory term. For cases, where a scale 
separation between resolved and unresolved variables exists and the resolved 
variables include all the slow variables, one way to choose the terms in the 
expansion is to retain only terms whose coefficients decay fast. The larger 
is the ratio of scale separation in the system under investigation, the more 
accurate should this approximation be. One would think that maybe what is 
needed for a better model is the inclusion of slowly decaying memory terms. 
However, this is not necessarily so unless one is willing to solve the orthogonal 
dynamics equation. The reason is that slowly decaying memory terms need to 
be computed through the solution of the orthogonal dynamics equation (see 
also Section 3.2). Otherwise, their inclusion can hamper the accuracy of the 
model (evidence of that for the Kuramoto-Sivashinsky equation was presented 
in [25]). In other words, when faced with a system that exhibits time scale 
separation and all the slow variables are resolved, then the appropriate choice 
for the memory terms (as far as expense /accuracy is concerned) is to pick the 
fast decaying ones. In the limit of infinite scale separation the choice of fast 
decaying terms becomes exact and this is the idea behind the construction of 
AMRS and short-memory MZ. 

The problem with the suggestion of picking only the fast decaying terms is that 
it quickly becomes impractical. To determine how fast the different coefficients 
decay we need to compute them and determine which ones decay fast by 
visual inspection. For a system with even a few resolved variables the number 
of possible combinations of basis functions grows very fast with the order of 
the polynomials used. Additionally, we do not expect all of the fast decaying 
coefficients to be equally important for the accuracy of the reduced model. 
Thus, if we include them all, we may reduce the efficiency of the the model 
unnecessarily. 

We propose here a partial fix of the problem of picking the coefficients that are 




(58) 
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most relevant for systems which exhibit time-scale separation. If the resolved 
variables include all the slow variables we expect, as already mentioned, the 
dominant coefficients in the expansion of the memory term to be fast decaying. 
However, among these fast decaying coefficients the most important ones are 
those with the largest values at s = (see (58)). In other words, if we compute 
the projection coefficients (Lxj, Lh K ), we suggest to keep in the expansion only 
the terms whose coefficient magnitude is substantially larger (e.g. at least a 
couple of orders of magnitude) than the magnitude of the coefficients of the 
rest of the terms. The advantage of this approach is that we can decide which 
terms' coefficients we want to compute by a relatively cheap calculation since 
we do not need to evolve the full system. We applied this criterion to the 
test cases and concluded that the most important terms in the expansion are 
similar to the ones predicted by AMRS. The calculation of the evolution of 
the coefficients for these terms (by integrating repeatedly the full system and 
averaging) showed that they are indeed fast decaying. However, note that there 
is the possibility that a projection coefficient (Lxj, Lh K ) may have a large value 
at s = 0, but the magnitude of the corresponding term h K in the expansion 
can be small, thus making the above procedure not completely failproof. 

We should add here, that the test cases we examine do not exhibit extremely 
large scale separation ratios and this results in errors in the predictions. This 
is to be expected and drives home the point that for an improved prediction 
we should, also, include in the expansion slowly-decaying coefficients of higher 
order, computed through the orthogonal dynamics equation. In MZ, in ad- 
dition to picking up only the fast decaying terms in the expansion, we have 
at our disposal the orthogonal dynamics equation whose solution, although 
expensive, can provide us with slowly decaying projection coefficients even 
for the case where there is no scale separation between the resolved and un- 
resolved variables. Of course, this ability stems from the fact that MZ is a 
reformulation of the problem, hence it allows, in principle, the construction of 
models of arbitrary accuracy. In AMRS, it is not possible, except for special 
cases (see Section 4.5 in [4]), to account for slowly decaying terms. 

It is true that the MZ formalism allows refinements by computing the evolution 
of slowly decaying memory terms. This can be done through the orthogonal dy- 
namics equation. However, there are two sources of additional computational 
expense in this case. One is to solve the orthogonal dynamics equation and the 
other is to solve the random integrodifferential equation for the reduced model, 
which will now have more integral terms. The solution of the orthogonal dy- 
namics equation is not particularly expensive because the quantities needed 
to set it up can be computed in the same process of computing the quantities 
needed for the short-memory model. Of course, there is an additional expense 
to actually solve the orthogonal dynamics equation but this is not as severe. 
The reason is that the solution of the orthogonal dynamics equation can be 
reduced to a solution of a system of linear Volterra integral equations which 
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can be done efficiently (see also [6]). The main expense appears after one has 
constructed the reduced model. First, there are more integral terms now than 
before, and moreover, these terms have slowly decaying integrands. As a re- 
sult, the repeated solution of such a reduced model can become prohibitively 
expensive because the evaluation of the integral terms can be very costly. 



6.3 Numerical simulations 



We present the results of the numerical simulation of the reduced models 
produced by AMRS and MZ for the test cases of Section 5. We should note 
here that the amount of work needed to construct and integrate the reduced 
models is comparable for the two methods. For AMRS, one needs to solve the 
full system repeatedly to compute the entries of the dissipation matrix T. For 
short-memory MZ, one needs to solve the full system repeatedly to compute 
the memory kernels (e sL Lxj, Lh K ). After the reduced models are computed, 
one needs to integrate them. In the case of AMRS, we need to integrate a 
system of stochastic differential equations. In the case of MZ, a system of 
random integrodifferential equations. The fact that the MZ reduced equations 
are random (colored noise) allows the faster convergence of the averaging 
procedure (e.g. to compute correlations). However, the gain in efficiency is 
offset by the fact that we deal with integrodifferential equations. As a result, 
the overall computational effort needed is the same for the two methods. More 
details about the implementation of the reduced models are offered below when 
the examples are analyzed. 

For the additive case the AMRS reduced equation for the resolved variable x\ 
is given by (43) and the MZ reduced equation is given by 

^ = - l\e sL Lx 1 ,LH 1 (x 1 ))H 1 (Mt-s))ds + F 1 (t) 
0i(O)=x 1 (O), 

where 

L = Rxi 'dx~ 1 + ^ Ryk dy~ k + Rzk 'dTk )) LXl =Rxi = X ^ h ^ VZykZk ' 

and Fi{t) is a stationary stochastic process with mean zero and autocorrelation 
(e tL Lxi, Lx\), where the inner product is weighted by the invariant density 
(40). The coupling constant A = 4, the truncation size is A = 50 and (3 = 50. 
Note that due to the short-memory approximation, the interval of the integral 
in (59) is restricted to [0,to]- This is done for two reasons. First, because the 
kernel of the integral (e sL Lxi, LH^Xi)) decays fast and second, because the 
estimate of the kernel can not be accurate for large s by the error estimates 
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Fig. 1. Additive case. Evolution of (e tL Lxi, LHi(xi)). 

for the short-memory approximation. The temporal evolution of the kernel is 
shown in Fig.(l). It was computed by sampling the density (40), evolving the 
full system (34) with the fourth-order Runge-Kutta algorithm and averaging. 
The results shown in Fig.(l) correspond to averaging over 10000 samples. For 
the numerical simulations of (59) we set t = 1. The fact that the kernel 
decays fast motivates its replacement by a delta-function multiplied by the 
integral of the kernel. This approximation will be referred to as the delta MZ 
approximation. To check the relevance of the short-memory approximation, 
we also solved the orthogonal dynamics equation for x\ keeping up to second 
order terms in the expansion of the memory and the temporal evolution of 
(e t Q L Lxi, LHi(xi)) is identical to (e tL Lxi, LHi(xi)). The parameter e = 0.63 
for this case, which is not small, but as seen from the results below it is small 
enough to guarantee the validity of the short-memory approximation and of 
AMRS. 

In [1], the autocorrelation of x\ was used to evaluate the performance of the 
reduction strategy. Fig. (2) shows the predictions for the autocorrelation of 
X\ from AMRS, short-memory MZ and the delta MZ approximation. The 
truth refers to the autocorrelation computed from the full system (in this case 
(34)). As mentioned before, only the first five interactions coefficients were 
nonzero. The truth was computed by sampling the density (40), evolving (34) 
with the fourth-order Runge-Kutta algorithm and averaging. The estimate was 
computed using 10000 samples. The AMRS reduced equation (43) was evolved 
with the Euler scheme for initial values of X\ drawn from the projected on x\ 
form of the density (40) and different realizations of the white noise term. The 
AMRS estimate of the autocorrelation was computed by averaging over 20000 
samples and noise realizations. Finally, the MZ estimates were produced by 
sampling again the projected on x\ density and the noise term i*i(t), evolving 
with the fourth-order Runge-Kutta algorithm and averaging. Note that, the 
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(a) (b) 

Fig. 2. Additive case, a) Autocorrelation of the resolved mode x\. b) Relative error 
of the predictions of the autocorrelation of x\. 

MZ equations have a colored noise term and thus do not require the use 
of a stochastic solver. The short-memory MZ estimates were computed by 
averaging over 1000 samples and noise realizations. The use of a colored noise 
allows for the faster convergence of the averaging procedure. However, the 
gain in computational time by the need to evolve fewer samples is balanced 
by the fact that we have to solve integrodifferential equations for the short- 
memory approximation. The delta MZ approximation, where we no longer 
have integral memory terms, performs surprisingly well, since it has almost 
the same accuracy as the short-memory MZ approximation while being much 
more efficient numerically (about 10 times). The delta MZ approximation was 
computed by averaging over 1000 samples and noise realizations. All three 
estimates have comparable accuracy for short times, while for later times the 
relative error of the AMRS estimate appears to increase reaching a plateau 
of about 20%. The MZ estimates' relative error remains around 3% for the 
interval of integration. 

For the multiplicative case (see (36)), the AMRS reduced equations for Xi,x 2 
are given by (44) while the MZ equations are 



^ =- [ t \e sL Lx 1 ,H 1 (x 1 ))H 1 (cj )l (t-s))ds 
at Jo 

- [ t0 (e sL Lx u L(H 2 (x 2 )H 1 (x 1 )))H 2 ( ( f )2 (t - s))#i(<M* - s))ds + F^t), 
Jo 

=- / <0 (e sL Lx 2 , LH 1 (x 2 ))H 1 ((j) 2 (t — s))ds 
at Jo 

- f t \e sL Lx 2 , L(# 2 (x 1 )^ 1 (x 2 )))iJ 2 (0 1 (t - S ))#!(0 2 (t - s))ds + F 2 (t), 
Jo 

0i(O) =m(0), 2 (O) = x 2 (0), 

(60) 
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where 



dxi dx 2 ^ yk dy k dz k J 

Lxi = R X1 = \J2( b k 2Vx 2yk + b 1 k l2z x 2 z k ), 

k 

Lx 2 = R X2 = A^(^ |1? 'xiy k + bl^x^k), 

k 

and Fi(t),F2(t) are stationary stochastic processes with mean zero and au- 
tocorrelation (e tL Lxi, Lxi), (e tL Lx 2 , Lx 2 ) respectively. The inner product is 
weighted by the invariant density (41). The coupling constant A = 3. The 
interval of integration is again restricted to [0, to] and for the numerical sim- 
ulations to — 2. Fig. (3) shows the temporal evolution of the memory kernels 
(e tL Lxi, LHi(x\)) and (e tL Lxi, L(H 2 (x 2 )Hi(xi))) . The kernels for the equa- 
tion for x 2 have similar behaviour. The kernels decay fast and, as in the pre- 
vious case, we also tried to replace the kernels by a delta function multiplied 
by the integral of the kernel and this approximation is again called delta MZ. 
However, the fact that we have to integrate the kernels up to to — 2 shows that 
they decay more slowly than in the additive case. This is to be expected if we 
look at the form of, say, Lx\. Due to the multiplicative coupling, this quan- 
tity now depends not only on the unresolved variables, but, also, on the slow 
resolved variable x 2 and thus the autocorrelation of Lx\ depends on the auto- 
correlation of x 2 . This results in a slower decay of (e tL Lxi, LHi(x\j), and we 
expect the error of the short-memory approximation to be larger than in the 
additive case. On the other hand, AMRS results in equations for the resolved 
variables that have a multiplicative noise (see (44)), i.e. the dependence on 
the slow variables is taken out of the noise process. This is in accordance with 
the way the AMRS equations are derived, which is in the limit of infinite-scale 
separation. 

The parameter e = 0.41. All algorithmic considerations and numbers of sam- 
ples are the same as in the additive case, except for the algorithm used to 
integrate the AMRS reduced equations (44). We followed [1] and used time- 
splitting, using a second-order Runge-Kutta algorithm for the nonlinear terms 
and the strong Milstein scheme of order 1 for the stochastic terms. 

To check again the validity of the short-memory approximation we solved the 
orthogonal dynamics equations for X\,x 2 keeping up to second order polyno- 
mials in each variable and the results for the kernels were very close to the 
ones predicted by the short-memory approximation. This suggests that we 
expect the short-memory and AMRS estimates to be, at least for short times, 
accurate. Fig. (4) shows the estimates for the autocorrelation of x\ as predicted 
by AMRS, short-memory MZ and delta MZ compared to the truth (similar 
results hold for x 2 ). The MZ estimates are superior to the ones produced by 
AMRS for up to time t « 5, after which the AMRS estimate becomes more 
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(a) (b) 

Fig. 3. Multiplicative case, a) Evolution of (e tL Lx\, LHi{x\)). b) Evolution of 
(e tL Lx 1 ,L(H 2 (x 2 )Hi(x 1 ))). 



accurate. Note that, up to time t w 5 the error for all methods is around 10%, 
thus justifying our expectation of the validity of the short-memory approxi- 
mation for short times. At the instant t = 10 when we stopped the integration 
the relative error of the AMRS estimate is about 25% while the one produced 
by the two MZ variants is about 45%. The larger error compared to the addi- 
tive case confirms our expectations stated above. The fact that delta MZ and 
short-memory MZ give very similar results and that to within second order 
polynomials the orthogonal dynamics kernels are close to the short-memory 
ones, indicates that the short-memory kernels are rather accurate. So, what 
seems to be needed for an improvement of the results is to compute, through 
the orthogonal dynamics, projection coefficients of higher order and, possibly, 
for these coefficients extend the interval of time integration in the memory 
term. We did not attempt this here, because our purpose was to compare 
AMRS and MZ in situations where their numerical efficiency is of the same 
order. 



The last test case we examined is the combined one (see (38)), where the 
coupling of the resolved variables to the unresolved is of both additive and 
multiplicative type. For this case, the AMRS reduced equations are given by 
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Fig. 4. Multiplicative CclSC. Si ) Autocorrelation of the resolved mode x\. b) Relative 
error of the predictions of the autocorrelation of x\. 

(49), while the reduced MZ equations are 



dt 



#2 

dt 



l -=- [ t \e sL Lx 1 ,H 1 (x 1 ))H 1 ( ( f )l (t-s))dt 
Jo 



to 



- f t \e sL Lx 2 ,LH 1 (x 2 ))H 1 ( ( f )2 (t-s))ds 
Jo 



e sL Lx 1 ,L(H 2 (x 2 )H 1 (x 1 )))H 2 (cf )2 (t - s^H^t ~ s))ds + F 1 (t), 



to 



e sL Lx 2 , L( J ff 2 (x 1 ) J ff 1 (x 2 ))) J ff 2 (0 1 (t - s))#i(0 2 (* - s))ds + F 2 (t), 



0i(O) =x 1 (0), 2 (O) =x 2 (0), 



(61) 



where 



d d A d d 

Lxi = R X1 = \ a J2 b k VZ ykZk + ^mJ2( b k 21 ' x 2Vk + bl l2z x 2 z k ), 

k k 

Lx 2 = R X2 = X a ^2b 2 k lvz y k z k + \ m ^(b^ lv xiy k + b 2 k ^ z x x z k ), 



and Fi(t),F 2 (t) are stationary stochastic processes with mean zero and au- 
tocorrelation (e tL Lxi, Lxi), (e tL Lx 2 , Lx 2 ) respectively. The inner product is 
weighted by the invariant density (41). The coupling constants are A a = 
4, \ m = 3. The interval of integration is again restricted to [0,t ] and for 
the numerical simulations to — 1. Note, that while the reduced AMRS equa- 
tions (49) involve terms that come either from the multiplicative coupling part 
or the additive coupling part, the MZ equations (61) involve also cross-terms, 
i.e. products of terms where one factor comes from the multiplicative part 
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(b) 



Fig. 5. Combined ) Evolution of (e tL Lxi, LHi(x±)). b) Evolution of 

(e tL Lx 1 ,L(H 2 (x 2 )H 1 (x 1 ))). ' 
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Fig. 6. Combined case, a) Autocorrelation of the resolved mode x\. b) Relative error 
of the predictions of the autocorrelation of x\. 

and the other from the additive part. Fig. (5) shows the temporal evolution of 
the kernels (e tL Lx 1 , LH^Xi)), (e tL Lx 1: L(H 2 (x 2 )H 1 (x 1 ))). The kernels for the 
equation for x 2 have similar behaviour. The kernels decay fast and, as in the 
previous case, we also tried to replace the kernels by a delta function multi- 
plied by the integral of the kernel and this approximation is again called delta 
MZ. The parameter e = 0.49. All algorithmic considerations and numbers of 
samples are the same as in the multiplicative case. 

Fig. (6) shows the estimates for the autocorrelation of Xi as predicted by 
AMRS, short-memory MZ and delta MZ compared to the truth (similar re- 
sults hold for x 2 ). The MZ estimates are superior to the ones produced by 
AMRS for up to time t m 5. After that and until time t«9, the AMRS esti- 
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mate's accuracy is higher. For the time interval shown here, the relative error 
for all three methods has a maximum value around 25%. If we compare the 
accuracy of the results for MZ and AMRS taking into account their behaviour 
in the pure additive and multiplicative cases, we see that the performance in 
the combined case is to be expected, since it is a mixture of the advantages 
and drawbacks of each individual method as manifested in the additive and 
multiplicative cases. The fact, that MZ exhibits higher accuracy for a larger 
interval of time, should be attributed to the fact that the additive coupling 
constant is larger, thus bringing the combined case somewhat closer to the 
additive case (see also discussion of results for the combined case in [1]). 



7 Conclusions 

We have presented numerical results comparing two stochastic mode reduc- 
tion strategies. The first method (AMRS), proposed by Majda, Timofeyev and 
Vanden-Eijnden, is based on an asymptotic strategy developed by Kurtz. The 
second method is a short-memory approximation of the Mori-Zwanzig pro- 
jection formalism of irreversible statistical mechanics, as proposed by Chorin, 
Hald and Kupferman. The novel feature of these methods is that they allow 
the use, in the reduced system, of higher order terms in the resolved vari- 
ables. The two methods were applied to a collection of test cases that exhibit 
separation of time-scales between the resolved and unresolved variables and, 
also, share some features with more complicated models used in the study of 
climate dynamics. 

Depending on the test case, one of the two methods can have superior accuracy, 
but the overall behaviour suggests that for cases with separation of time-scales, 
the two methods result in reduced systems of equations that have similar 
predictive ability. For the test cases we examined, the replacement of the 
kernels in the memory terms by delta-functions (the delta MZ approximation) 
does not appear to be very harmful to the accuracy of the approximation, 
while at the same time it makes the integration of the reduced equations 
around 10 times faster. The test cases highlight the limitations of AMRS and 
of short-memory MZ when the separation of time-scales becomes less sharp. 
In this case, MZ allows for a systematic, although expensive, calculation of 
reduced equations that incorporate long-time memory effects. On the other 
hand, AMRS, by construction, cannot be readily extended to these cases. For 
special cases, like the multiplicative case above (see also Section 4.5 in [4]), 
the reduction performed by AMRS can be effected by working directly on 
the stochastic differential equations (4), without recourse to the associated 
Chapman-Kolmogorov equation. This allows for a systematic development of 
reduced equations that account for long-time memory effects. 
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It would be interesting to compare the two methods when applied to more 
realistic models e.g. equations for climate dynamics, where a separation of 
time-scales is known to exist between the quantities of interest and the huge 
number of faster variables that constitute the climate system [30]. 
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